Abstract

Two modifications of the $\chi^2$ test are proposed: one for comparing usual (unweighted) and weighted histograms, and one for comparing two weighted histograms. Numerical examples illustrate an application of the tests to histograms with different statistics of events. The proposed tests can be used for the comparison of experimental data histograms against simulated data histograms and of two simulated data histograms.
1 Introduction

A frequently used technique in data analysis is the comparison of histograms. First suggested by Pearson [1], the $\chi^2$ test of homogeneity is widely used for comparing usual (unweighted) histograms. A modified $\chi^2$ test for the comparison of weighted and unweighted histograms was recently proposed in [2].

This paper develops the ideas presented in [2]. From this development, two new results are presented. First, the $\chi^2$ test for comparing weighted and unweighted histograms is improved so that it can be applied to histograms with a lower minimal number of events in a bin than is recommended in [2]. Second, a new $\chi^2$ test is proposed for the comparison of two weighted histograms.

The paper is organized as follows. In section 2 the usual $\chi^2$ test and its application to the comparison of usual unweighted histograms are discussed. Tests for the comparison of weighted and unweighted histograms and of two weighted histograms are proposed in sections 3 and 4 respectively. In section 5 the tests are illustrated and verified by a numerical example and experiments.



2 $\chi^2$ test for comparison of two (unweighted) histograms



Without limiting the general nature of the discussion, we consider two histograms with the same binning and the number of bins equal to $r$. Let us denote the number of events in the $i$th bin in the first histogram as $n_i$ and as $m_i$ in the second one. The total number of events in the first histogram is equal to $N = \sum_{i=1}^{r} n_i$, and $M = \sum_{i=1}^{r} m_i$ in the second histogram.

The hypothesis of homogeneity [3] is that the two histograms represent random values with identical distributions. This is equivalent to the existence of $r$ constants $p_1, ..., p_r$, such that $\sum_{i=1}^{r} p_i = 1$, and the probability of belonging to the $i$th bin for some measured value in both experiments is equal to $p_i$. The number of events in the $i$th bin is a random variable with a distribution approximated by a Poisson probability distribution $e^{-Np_i}(Np_i)^{n_i}/n_i!$ for the first histogram and with distribution $e^{-Mp_i}(Mp_i)^{m_i}/m_i!$ for the second histogram. If the hypothesis of homogeneity is valid, then the maximum likelihood estimator of $p_i$, $i = 1, ..., r$, is

$$\hat{p}_i = \frac{n_i + m_i}{N + M}, \qquad (1)$$

and then

$$X^2 = \sum_{i=1}^{r} \frac{(n_i - N\hat{p}_i)^2}{N\hat{p}_i} + \sum_{i=1}^{r} \frac{(m_i - M\hat{p}_i)^2}{M\hat{p}_i} = \frac{1}{MN} \sum_{i=1}^{r} \frac{(Mn_i - Nm_i)^2}{n_i + m_i} \qquad (2)$$

has approximately a $\chi^2_{(r-1)}$ distribution [3].
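
As a minimal sketch, the statistic above can be evaluated directly from the two lists of bin contents using its closed form. The function name `chi2_unweighted` and the decision to skip bins that are empty in both histograms (and to reduce the degrees of freedom accordingly) are assumptions made here for illustration and are not prescribed by the text.

```python
def chi2_unweighted(n, m):
    """Return (X2, dof) for two unweighted histograms with identical binning."""
    if len(n) != len(m):
        raise ValueError("histograms must have the same number of bins")
    N, M = sum(n), sum(m)
    total = 0.0
    used_bins = 0
    for ni, mi in zip(n, m):
        if ni + mi == 0:
            # Assumption: a bin empty in both histograms carries no
            # information and is dropped (reducing the degrees of freedom).
            continue
        total += (M * ni - N * mi) ** 2 / (ni + mi)
        used_bins += 1
    return total / (M * N), used_bins - 1


# Toy input: three bins, 60 events in each histogram.
x2, dof = chi2_unweighted([10, 20, 30], [20, 20, 20])
```

The closed form avoids computing the estimates $\hat{p}_i$ explicitly; the two-sum form gives the same value.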

The comparison procedure can include an analysis of the residuals, which is often helpful in identifying the bins of histograms responsible for a significant overall $X^2$ value. Most convenient for analysis are the adjusted (normalized) residuals [4]

$$r_i = \frac{n_i - N\hat{p}_i}{\sqrt{N\hat{p}_i}\,\sqrt{(1 - N/(N + M))(1 - (n_i + m_i)/(N + M))}}. \qquad (3)$$

If the hypothesis of homogeneity is valid, then the residuals $r_i$ are approximately independent and identically distributed random variables having a $\mathcal{N}(0, 1)$ distribution. Notice that residuals (3) are related to the first histogram, and the residuals related to the second histogram are

$$r'_i = \frac{m_i - M\hat{p}_i}{\sqrt{M\hat{p}_i}\,\sqrt{(1 - M/(N + M))(1 - (n_i + m_i)/(N + M))}}. \qquad (4)$$

As $r_i = -r'_i$, it makes sense to use either residuals (3) or residuals (4).
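
The relation between the two sets of adjusted residuals can be checked numerically. The sketch below is illustrative only: the function name `adjusted_residuals` is hypothetical, and it assumes every bin has $0 < \hat{p}_i < 1$ so that the square roots are well defined.

```python
import math


def adjusted_residuals(n, m):
    """Return the adjusted residuals of both histograms, bin by bin."""
    N, M = sum(n), sum(m)
    first, second = [], []
    for ni, mi in zip(n, m):
        p_hat = (ni + mi) / (N + M)      # pooled estimate of p_i
        shared = 1.0 - p_hat             # 1 - (n_i + m_i)/(N + M)
        # Assumption: 0 < p_hat < 1, i.e. the bin is neither empty in both
        # histograms nor the only populated bin.
        first.append((ni - N * p_hat)
                     / math.sqrt(N * p_hat * (1.0 - N / (N + M)) * shared))
        second.append((mi - M * p_hat)
                      / math.sqrt(M * p_hat * (1.0 - M / (N + M)) * shared))
    return first, second


r_first, r_second = adjusted_residuals([10, 20, 30], [20, 20, 20])
```

For any input the two lists should differ only in sign, so in practice it is enough to inspect one of them.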

The application of the $\chi^2$ test has restrictions related to the value of the expected frequencies $Np_i$, $Mp_i$, $i = 1, ..., r$. A conservative rule formulated in [5] is that all the expectations must be 1 or greater for both histograms. The authors point out that this rule is extremely conservative and in the majority of cases the $\chi^2$ test may be used for histograms with expectations in excess of 0.5 in the smallest bin. In practical cases when expected frequencies are not known, the estimated expected frequencies $M\hat{p}_i$, $N\hat{p}_i$, $i = 1, ..., r$ can be used.



3 Unweighted and weighted histograms comparison 



A simple modification of the ideas described above can be used for the comparison of the usual (unweighted) and weighted histograms. Let us denote the number of events in the $i$th bin in the unweighted histogram as $n_i$ and the common weight of events in the $i$th bin of the weighted histogram as $w_i$. The total number of events in the unweighted histogram is equal to $N = \sum_{i=1}^{r} n_i$ and the total weight of events in the weighted histogram is equal to $W = \sum_{i=1}^{r} w_i$.

Let us formulate the hypothesis of identity of an unweighted histogram to a weighted histogram so that there exist $r$ constants $p_1, ..., p_r$, such that $\sum_{i=1}^{r} p_i = 1$, and the probability of belonging to the $i$th bin for some measured value is equal to $p_i$ for the unweighted histogram and the expectation value of the weight $w_i$ is equal to $Wp_i$ for the weighted histogram. The number of events in the $i$th bin is a random variable with a distribution approximated by the Poisson probability distribution $e^{-Np_i}(Np_i)^{n_i}/n_i!$ for the unweighted histogram. The weight $w_i$ is a random variable with a distribution approximated by the normal probability distribution $\mathcal{N}(Wp_i, \sigma_i^2)$, where $\sigma_i^2$ is the variance of the weight $w_i$. If we replace the variance $\sigma_i^2$ with the estimate $s_i^2$ (the sum of squares of weights of events in the $i$th bin) and the hypothesis of identity is valid, then the maximum likelihood estimator of $p_i$, $i = 1, ..., r$, is

$$\hat{p}_i = \frac{Ww_i - Ns_i^2 + \sqrt{(Ww_i - Ns_i^2)^2 + 4W^2 s_i^2 n_i}}{2W^2}. \qquad (5)$$

We may then use the test statistic

$$X^2 = \sum_{i=1}^{r} \frac{(n_i - N\hat{p}_i)^2}{N\hat{p}_i} + \sum_{i=1}^{r} \frac{(w_i - W\hat{p}_i)^2}{s_i^2} \qquad (6)$$

and it is plausible that this has approximately a $\chi^2_{(r-1)}$ distribution.
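
The estimator and the test statistic for the unweighted versus weighted comparison can be sketched as below. This is an illustration under stated assumptions, not a reference implementation: the function name `chi2_unweighted_weighted` is hypothetical, every bin is assumed to have $s_i^2 > 0$, and every bin is assumed to yield a strictly positive estimate $\hat{p}_i$ (which holds, for example, whenever $n_i > 0$).

```python
import math


def chi2_unweighted_weighted(n, w, s2):
    """Return (X2, p_hat) for an unweighted histogram n and a weighted
    histogram given by per-bin sums of weights w and squared weights s2."""
    N, W = sum(n), sum(w)
    x2 = 0.0
    p_hat = []
    for ni, wi, vi in zip(n, w, s2):
        a = W * wi - N * vi
        # Positive root of W^2 p^2 - (W w_i - N s_i^2) p - n_i s_i^2 = 0.
        p = (a + math.sqrt(a * a + 4.0 * W * W * vi * ni)) / (2.0 * W * W)
        p_hat.append(p)
        # Assumption: p > 0 and s_i^2 > 0, otherwise these terms are undefined.
        x2 += (ni - N * p) ** 2 / (N * p) + (wi - W * p) ** 2 / vi
    return x2, p_hat


# Toy input: the weighted histogram has the same shape as the unweighted one
# (10, 20 and 30 events of weight 0.5), so the statistic should vanish.
x2, p_hat = chi2_unweighted_weighted([10, 20, 30],
                                     [5.0, 10.0, 15.0],
                                     [2.5, 5.0, 7.5])
```

The estimator is the positive root of the quadratic obtained by setting the derivative of the bin's log-likelihood to zero, which is why only the `+` branch of the square root is used.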

This test, as well as the original one [3], has a restriction on the expected frequencies. The expected frequency recommended for the weighted histogram is more than 25. The value of the minimal expected frequency can be decreased down to 10 for the case when the weights of the events are close to constant. In the case of a weighted histogram, if the number of events is unknown, then we can apply this recommendation to the equivalent number of events $n_i^{equiv} = w_i^2/s_i^2$. The minimal expected frequency for an unweighted histogram must be 1. Notice that any usual (unweighted) histogram can be considered as a weighted histogram with events that have constant weights equal to 1.
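
To make the quantities $w_i$, $s_i^2$ and the equivalent number of events concrete, the sketch below fills a weighted histogram from a list of events. The helper names `fill_weighted` and `equivalent_events`, the handling of out-of-range events, and the convention of returning 0 for an empty bin are assumptions introduced here for illustration.

```python
import bisect


def fill_weighted(xs, weights, edges):
    """Return per-bin sums of weights (w) and of squared weights (s2)."""
    nbins = len(edges) - 1
    w = [0.0] * nbins
    s2 = [0.0] * nbins
    for x, wt in zip(xs, weights):
        if x < edges[0] or x > edges[-1]:
            continue  # events outside the histogram range are ignored
        # The upper edge of the last bin is treated as inclusive.
        i = min(bisect.bisect_right(edges, x) - 1, nbins - 1)
        w[i] += wt
        s2[i] += wt * wt
    return w, s2


def equivalent_events(w, s2):
    """Equivalent number of events w_i^2 / s_i^2 (0 for an empty bin)."""
    return [wi * wi / vi if vi > 0.0 else 0.0 for wi, vi in zip(w, s2)]


# Two events of weight 2 in the first bin, one event of weight 3 in the second.
w, s2 = fill_weighted([0.1, 0.2, 0.7], [2.0, 2.0, 3.0], [0.0, 0.5, 1.0])
n_equiv = equivalent_events(w, s2)
```

When all weights in a bin are equal, the equivalent number of events reduces to the actual number of events in that bin, which matches the remark that an unweighted histogram is a weighted histogram with unit weights.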

The variance $z_i^2$ of the difference between the weight $w_i$ and the estimated expectation value of the weight is approximately equal to

$$z_i^2 = Var(w_i - W\hat{p}_i) = N\hat{p}_i(1 - N\hat{p}_i)\left(\frac{Ws_i^2}{\sqrt{(Ns_i^2 - w_iW)^2 + 4W^2 s_i^2 n_i}}\right)^2 + \frac{s_i^2}{4}\left(1 + \frac{Ns_i^2 - w_iW}{\sqrt{(Ns_i^2 - w_iW)^2 + 4W^2 s_i^2 n_i}}\right)^2. \qquad (7)$$

The residuals

$$r_i = \frac{w_i - W\hat{p}_i}{z_i} \qquad (8)$$

have approximately a normal distribution with mean equal to 0 and standard deviation equal to 1.



4 Two weighted histograms comparison 



Let us denote the common weight of events of the $i$th bin in the first histogram as $w_{1i}$ and as $w_{2i}$ in the second one. The total weight of events in the first histogram is equal to $W_1 = \sum_{i=1}^{r} w_{1i}$, and $W_2 = \sum_{i=1}^{r} w_{2i}$ in the second histogram.

Let us formulate the hypothesis of identity of weighted histograms so that there exist $r$ constants $p_1, ..., p_r$, such that $\sum_{i=1}^{r} p_i = 1$, and also the expectation value of the weight $w_{1i}$ is equal to $W_1 p_i$ and the expectation value of the weight $w_{2i}$ is equal to $W_2 p_i$. Weights in both histograms are random variables with distributions which can be approximated by a normal probability distribution $\mathcal{N}(W_1 p_i, \sigma_{1i}^2)$ for the first histogram and by a distribution $\mathcal{N}(W_2 p_i, \sigma_{2i}^2)$ for the second. Here $\sigma_{1i}^2$ and $\sigma_{2i}^2$ are the variances of $w_{1i}$ and $w_{2i}$ with estimators $s_{1i}^2$ and $s_{2i}^2$ respectively. If the hypothesis of identity is valid, then the maximum likelihood and least squares estimator of $p_i$, $i = 1, ..., r$, is

$$\hat{p}_i = \frac{w_{1i}W_1/s_{1i}^2 + w_{2i}W_2/s_{2i}^2}{W_1^2/s_{1i}^2 + W_2^2/s_{2i}^2}. \qquad (9)$$





We may then use the test statistic

$$X^2 = \sum_{i=1}^{r} \frac{(w_{1i} - W_1\hat{p}_i)^2}{s_{1i}^2} + \sum_{i=1}^{r} \frac{(w_{2i} - W_2\hat{p}_i)^2}{s_{2i}^2} = \sum_{i=1}^{r} \frac{(W_1 w_{2i} - W_2 w_{1i})^2}{W_1^2 s_{2i}^2 + W_2^2 s_{1i}^2} \qquad (10)$$

and it is plausible that this has approximately a $\chi^2_{(r-1)}$ distribution. The normalized or studentised residuals [6]

$$r_i = \frac{w_{1i} - W_1\hat{p}_i}{s_{1i}\sqrt{1 - 1/(1 + W_2^2 s_{1i}^2/(W_1^2 s_{2i}^2))}} \qquad (11)$$

have approximately a normal distribution with mean equal to 0 and standard deviation 1. A recommended minimal expected frequency is equal to 10 for the proposed test.
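
A minimal sketch of the two-weighted-histogram test is given below. The function name `chi2_weighted_weighted` and the toy inputs are hypothetical, and the sketch assumes $s_{1i}^2 > 0$ and $s_{2i}^2 > 0$ in every bin. It computes the estimate of $p_i$, the statistic in its two-sum form, and the residuals of the first histogram.

```python
import math


def chi2_weighted_weighted(w1, s1sq, w2, s2sq):
    """Return (X2, residuals) for two weighted histograms described by
    per-bin sums of weights (w1, w2) and of squared weights (s1sq, s2sq)."""
    W1, W2 = sum(w1), sum(w2)
    x2 = 0.0
    residuals = []
    for a, va, b, vb in zip(w1, s1sq, w2, s2sq):
        # Assumption: va > 0 and vb > 0 (no empty bins in either histogram).
        p = (a * W1 / va + b * W2 / vb) / (W1 * W1 / va + W2 * W2 / vb)
        x2 += (a - W1 * p) ** 2 / va + (b - W2 * p) ** 2 / vb
        shrink = 1.0 - 1.0 / (1.0 + (W2 * W2 * va) / (W1 * W1 * vb))
        residuals.append((a - W1 * p) / math.sqrt(va * shrink))
    return x2, residuals


w1, s1sq = [5.0, 10.0, 15.0], [2.5, 5.0, 7.5]
w2, s2sq = [8.0, 9.0, 13.0], [4.0, 4.5, 6.5]
x2, residuals = chi2_weighted_weighted(w1, s1sq, w2, s2sq)
```

A useful consistency check is that each squared residual equals the corresponding bin's contribution to the statistic, so the squared residuals sum to $X^2$; this follows algebraically from the definitions above.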



5 Numerical example and experiments 



The method described herein is now illustrated with an example. We take a distribution

$$\phi(x) = \frac{2}{(x - 10)^2 + 1} + \frac{1}{(x - 14)^2 + 1} \qquad (12)$$

defined on the interval [4, 16]. Events distributed according to formula (12) are simulated to create the unweighted histogram. Uniformly distributed events are simulated for the weighted histogram with weights calculated by formula (12). Each histogram has the same number of bins: 20. Fig. 1 shows the result of the comparison of the unweighted histogram with 200 events (minimal expected frequency equal to one) and the weighted histogram with 500 events (minimal expected frequency equal to 25).
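
The quoted minimal expected frequencies can be cross-checked by integrating the density over each bin. The sketch below assumes that formula (12) is the sum of two Breit-Wigner-like peaks with numerators 2 and 1, $2/((x-10)^2+1) + 1/((x-14)^2+1)$, and that the 20 bins have equal width on [4, 16]; both are assumptions of this illustration, as are the function names.

```python
import math


def phi_integral(a, b):
    """Integral of 2/((x-10)^2+1) + 1/((x-14)^2+1) over [a, b]
    (assumed form of formula (12)), using the arctangent antiderivative."""
    return (2.0 * (math.atan(b - 10.0) - math.atan(a - 10.0))
            + (math.atan(b - 14.0) - math.atan(a - 14.0)))


def bin_probabilities(lo=4.0, hi=16.0, nbins=20):
    """Probability content of each equal-width bin under the density."""
    norm = phi_integral(lo, hi)
    width = (hi - lo) / nbins
    return [phi_integral(lo + i * width, lo + (i + 1) * width) / norm
            for i in range(nbins)]


probs = bin_probabilities()
# Unweighted histogram: 200 events distributed according to the density.
min_expected_unweighted = 200 * min(probs)
# Weighted histogram: 500 uniformly distributed events over 20 bins.
min_expected_weighted = 500 / 20
```

Under these assumptions the least populated bin of the unweighted histogram is the first one, with an expectation close to one event, and each bin of the weighted histogram expects 25 events.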

The value of the test statistic $X^2$ is equal to 21.09 with p-value equal to 0.33, therefore the hypothesis of identity of the two histograms can be accepted. The behavior of the normalized residuals plot (see Fig. 1c) and of the normal Q-Q plot of residuals (see Fig. 1d) is regular, and we cannot identify outliers or bins with a big influence on $X^2$.
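
The quoted p-value can be checked approximately without a statistics library. The sketch below swaps in the Wilson-Hilferty normal approximation for the exact chi-square tail probability, purely to stay within the Python standard library; the function name is hypothetical, and 19 degrees of freedom are assumed from the 20 bins.

```python
import math


def chi2_sf_wilson_hilferty(x, dof):
    """Approximate upper-tail probability P(chi2_dof > x) using the
    Wilson-Hilferty cube-root normal approximation."""
    z = (((x / dof) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * dof)))
         / math.sqrt(2.0 / (9.0 * dof)))
    # Standard normal upper tail expressed through the complementary
    # error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_value = chi2_sf_wilson_hilferty(21.09, 19)
```

The result should land near the quoted 0.33; for a reported analysis an exact chi-square survival function would be preferable to this approximation.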

Fig. 1. An example of comparison of the unweighted histogram with 200 events and the weighted histogram with 500 events: a) unweighted histogram; b) weighted histogram; c) normalized residuals plot; d) normal Q-Q plot of residuals.

To investigate the dependence of the distribution of the test statistics on the number of events, all three tests were considered.

The comparison of pairs of unweighted histograms with different minimal expected frequencies was considered (Pearson's chi-square test). Unweighted histograms with minimal expected frequencies equal to one (200 events), 2.5 (500 events) and 5 (1000 events) were simulated. Fig. 2 shows the Q-Q plots of $X^2$ statistics for different pairs of histograms. In each case 10000 pairs of histograms were simulated. As we can see, for all cases the real distributions of the test statistics are close to the theoretical $\chi^2_{19}$ distribution.

Fig. 2. Chi-square Q-Q plots of $X^2$ statistics for two unweighted histograms with different minimal expected frequencies.

The comparison of pairs of unweighted and weighted histograms with different minimal expected frequencies was considered using the test proposed in section 3 above. Unweighted histograms with minimal expected frequencies equal to one (200 events), 2.5 (500 events) and 5 (1000 events) were simulated. Furthermore, weighted histograms with minimal expected frequencies equal to 10 (200 events), 25 (500 events) and 50 (1000 events) were simulated. Fig. 3 shows the Q-Q plots of $X^2$ statistics for different pairs of histograms. As we can see, the real distribution of the test statistics obtained for a minimal expected frequency of weighted events equal to 10 has a heavier tail than the theoretical $\chi^2_{19}$ distribution. This means that the p-value calculated with the theoretical $\chi^2_{19}$ distribution is lower than the real p-value and any decision about the rejection of the hypothesis of identity of the two distributions is conservative. The distributions of the test statistics for the minimal expected frequencies 25 and 50 are close to the theoretical distribution. This confirms that a minimal expected frequency of 25 is a reasonable restriction for the weighted histogram for this test.

Fig. 3. Chi-square Q-Q plots of $X^2$ statistics for unweighted and weighted histograms with different minimal expected frequencies.

The comparison of two weighted histograms with different minimal expected frequencies was considered using the test proposed in section 4 above. Weighted histograms with minimal expected frequencies equal to 10 (200 events), 25 (500 events) and 50 (1000 events) were simulated. Fig. 4 shows the Q-Q plots of $X^2$ statistics for different pairs of histograms. As we can see, the real distributions of the test statistics are close to the theoretical $\chi^2_{19}$ distribution if the minimal expectations of the two histograms are close to each other, that is, in all cases excluding case (10, 50). For the case when the difference in expectations is big (10, 50), the real distribution of the test statistics has a heavier tail than the theoretical $\chi^2_{19}$ distribution.
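
The kind of numerical experiment described for two weighted histograms can be sketched as a small Monte Carlo loop. Everything specific below is an assumption of this illustration: the form of the density (numerators 2 and 1), the helper names, the fixed seed, and the reduced number of pairs (200 instead of the 10000 used in the text). The statistic is computed in its single-sum closed form.

```python
import random


def phi(x):
    # Assumed form of the density in formula (12).
    return 2.0 / ((x - 10.0) ** 2 + 1.0) + 1.0 / ((x - 14.0) ** 2 + 1.0)


def weighted_histogram(n_events, rng, lo=4.0, hi=16.0, nbins=20):
    """Uniform events on [lo, hi] weighted by phi(x); returns (w, s2)."""
    w = [0.0] * nbins
    s2 = [0.0] * nbins
    for _ in range(n_events):
        x = rng.uniform(lo, hi)
        i = min(int((x - lo) / (hi - lo) * nbins), nbins - 1)
        wt = phi(x)
        w[i] += wt
        s2[i] += wt * wt
    return w, s2


def x2_two_weighted(w1, s1, w2, s2):
    """Closed-form statistic for two weighted histograms.
    Assumption: no bin is empty in both histograms at once."""
    W1, W2 = sum(w1), sum(w2)
    return sum((W1 * b - W2 * a) ** 2 / (W1 * W1 * vb + W2 * W2 * va)
               for a, va, b, vb in zip(w1, s1, w2, s2))


def simulate(n_pairs, n1, n2, seed=1):
    """Return the sorted X2 values for n_pairs simulated histogram pairs."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_pairs):
        w1, s1 = weighted_histogram(n1, rng)
        w2, s2 = weighted_histogram(n2, rng)
        values.append(x2_two_weighted(w1, s1, w2, s2))
    return sorted(values)


stats = simulate(n_pairs=200, n1=500, n2=500)
mean_x2 = sum(stats) / len(stats)
```

The sorted values in `stats` are the empirical quantiles that would be plotted against the quantiles of the theoretical distribution with 19 degrees of freedom in a Q-Q plot; a sample mean far from 19 would already signal a poor approximation.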

To verify the proposed tests two further numerical experiments were per- 
formed. 

For the first case, unweighted histograms with minimal expected frequencies equal to 10 (2000 events), 25 (5000 events) and 50 (10000 events) were simulated. These histograms were compared to an unweighted histogram with expected frequencies of 10 or more by the three methods described above. Fig. 5 shows the Q-Q plots of $X^2$ statistics for different pairs of histograms. As we can see, the real distributions of the test statistics are close to the theoretical $\chi^2_{19}$ distribution for all three tests.
For the second case, unweighted histograms with minimal expected frequencies equal to one (200 events), 2.5 (500 events) and 5 (1000 events) were simulated. These histograms were compared to an unweighted histogram with expected frequencies of 10 or more by the first two methods described above. Fig. 6 shows the Q-Q plots of the $X^2$ statistics for different pairs of histograms. As we can see, for all cases the real distributions of the test statistics are close to the theoretical $\chi^2_{19}$ distribution. Also, the real distributions of the test statistics for the proposed method of comparison of unweighted and weighted histograms (see Fig. 6b) do not have heavy tails, as is the case for a weighted histogram with weights calculated according to formula (12) (see Fig. 3). This example confirms that a minimal expected frequency equal to 10 is enough for the application of the method of comparison of unweighted and weighted histograms if the weights of the events are close to a constant for the weighted histogram.
Fig. 5. Chi-square Q-Q plots of $X^2$ statistics for two unweighted histograms with different tests: a) Pearson's chi-square test; b) the test proposed in this article for unweighted and weighted histograms; c) the test proposed in this article for two weighted histograms.

6 Conclusions 

A chi-square test for comparing the usual (unweighted) histogram and the weighted histogram, together with a test for comparing two weighted histograms, were proposed. In both cases formulas for normalized residuals were presented that can be useful for the identification of bins that are outliers, or bins that have a big influence on $X^2$. For the first test the recommended minimal expected frequency of events is equal to 1 for an unweighted histogram and 10-25 for a weighted histogram. For the second test the recommended minimal expected frequency is equal to 10. Numerical examples illustrate an application of the method to histograms with different statistics of events and confirm that the proposed restrictions related to the expectations are reasonable. The approach proposed in this paper can be generalized to a comparison of several unweighted and weighted histograms or just weighted histograms. The test statistic has approximately a $\chi^2_{(r-1)(s-1)}$ distribution for $s$ histograms with $r$ bins.